(54) Title: DATA CLASSIFIER OUTPUT INTERPRETATION

(57) Abstract: A method, apparatus, and computer software are provided whereby to associate a readily user-interpretable reason
with an output of a supervised training data classifier. Reasons are associated with one or more members of a sequence of training
vectors and subsequently associated, in operation, with a given classifier input by comparing the classifier input vector with training
sequence vectors. A measure of confidence in the selected reasons is derived by comparing the classifier input vector with the
corresponding training inputs with which the selected reasons were associated and calculating a measure of their closeness.

DATA CLASSIFIER OUTPUT INTERPRETATION
FIELD OF THE INVENTION 

The present invention relates to a method and apparatus for interpretation 
of data classifier outputs and a system incorporating the same.

BACKGROUND TO THE INVENTION 

Trainable data classifiers (for example Neural Networks) can learn to 
classify given input vectors into the required output group with a high 
degree of accuracy. However, a known limitation of such data classifiers 
is that they do not provide any explanation or reason as to why a
particular decision has been made. This "black box" nature of the decision 
making process is a disadvantage when human users want to be able to 
understand the decision before acting on it. 

Such data classifiers can be split into two main groups: those which have 
a supervised training period and those which are trained in an
unsupervised manner. Those trained in a supervised manner (i.e. 
supervisedly trained data classifiers) include, for example, Multi Layer 
Perceptrons (MLPs). 

In order for a supervisedly trained data classifier (e.g. Neural Network) to 
be trained, a training set of examples has to be provided. The examples
contain associated input/output vector pairs where the input vector is what 
the data classifier will see when performing its classification task, and the 
output vector is the desired response for that input vector. The data 
classifier is then trained over this set of input/output pairs and learns to 
associate the required output with the given inputs. The data classifier
thereby learns how to classify the different regions of the input space in
line with the problem represented in the training set. When the data
classifier is subsequently given an input vector to classify it produces an
output vector dependent upon the previously learned region of the input
space that the input vector occupies.

In the case of some very simple classification problems, the "reasoning"
made by the data classifier may, perhaps, be intuitively guessed by a 
user. However, Neural Networks are typically used to solve problems 
without a well bounded problem space and which have no solution 
obvious to humans. If the rules defining such a problem were clear then a 
rule-based system would probably provide a more suitable classifier than 
a supervisedly trained data classifier such as a Neural Network. A typical 
data classifier application involves complex, high-dimensional data where 
the rules between input vectors and correct classification are not at all 
obvious. In such situations the supervisedly trained data classifier 
becomes a complex, accurate decision maker but unfortunately offers the 
human end user no help understanding the decision that it has reached. 
In many situations the end user nevertheless wishes to understand, at 
least to some degree, why a given supervisedly trained data classifier
has reached a decision before the user can act upon that
decision with confidence. 

In the past, much work has been directed to extracting rules from Neural 
Networks, where people have attempted to convert the weights contained 
within the Neural Network topology into if-then-else type rules [Andrews, 
R., Diederich, J., & Tickle, A. (1995): "A survey and critique of techniques
for extracting rules from trained artificial neural networks" in Knowledge 
Based Systems, 8(6), pp.373-389]. This work has had only limited 
success and the rules generated have not been clear, concise, nor readily 
understandable. Work has also been performed which concentrates on 
the problem as a rule inversion problem; given a subset Y of the output 
space, find the reciprocal image of Y by the function f computed by the 
Neural Network [Maire, F. (1995): "Rule-extraction by backpropagation of
polyhedra" in Neural Networks, 12(4-5), pp. 717-725. Pub.
Elsevier/Pergamon, ISSN 0893-6080]. This method back-propagates
regions from the output layer back to the input layer. Unfortunately, whilst 
this method is theoretically sound, the output from this method is once 
again not readily understandable to the user, and so does not solve the 
problem of helping the user to understand the reason for a Neural 
Network's decision. 

Other methods which have been tried in the past divide each individual
value in the input vector into different categories (percentile bins). This
technique is described in, for example, US 5,745,654. Each percentile bin
has associated with it an explanation describing the meaning of the
associated individual input value rather than for the whole vector of input 
values. A reason is then associated with the output vector, selected as 
being the reason associated with the most significant input variable in the 
input vector. This method does not take into consideration the facts that
the data classifier classifies on the input vector as a whole and that 
relationships between input variables are often significant. It also requires 
some definition of relative significance of the component variables of an 
input vector which is not always meaningful. 

OBJECT OF THE INVENTION

The invention seeks to provide an improved method and apparatus for 
interpreting outputs from supervisedly trained data classifiers. 

SUMMARY OF THE INVENTION 

The present invention provides to a user a textual (or other representation 
of a) reason for a decision made by a supervisedly trained data classifier
(e.g. Neural Network). The reason may be presented only when it is 
required, in a manner that does not hinder the performance of the data 
classifier, is easily understood by the end user of the data classifier, and 
is scaleable to large, high dimensional data sets. 

According to a first aspect of the present invention there is provided a
method of operating a supervisedly trained data classifier, comprising the 
steps of: generating an output vector responsive to provision of an input 
vector; associating a reason with said classifier output vector responsive 
to a comparison between said classifier input vector and a previously
stored association between a training vector used to train said classifier
and said reason. 

Advantageously, the method of operation facilitates later interpretation of 
the classifier outputs by a user, and is scaleable to large, high 
dimensional data sets. 

Preferably, the method additionally comprises the step of: presenting to a
user information indicative of said output vector, of said reason, and of 
their association.

Advantageously, the association enables the user to interpret the
classifier outputs more rapidly and more directly. 

Preferably, the method additionally comprises the step of: associating 
with said reason a measure of a degree of confidence with which said 
reason is associated with said input vector.

Preferably, the method additionally comprises the step of: presenting to 
said user information indicative of said measure of a degree of 
confidence. 

Preferably, said degree of confidence is calculated
responsive to a comparison between said training vector and said input 
vector. 

Preferably, said degree of confidence is calculated as a distance between 
said input vector and an input vector component of said training vector.

Preferably, said distance is a Euclidean distance. 

Advantageously, these measures are simple to calculate and provide a
good and intuitively easy to understand measure of confidence. 

In a preferred embodiment, a plurality of reasons may be associated with 
said classifier output vector responsive to comparisons between said 
classifier input vector and a plurality of previously stored associations 
between training vectors used to train said classifier and said reasons. 

Preferably, the method additionally comprises the step of: associating 
with each said reason a measure of a degree of confidence with which 
said reason is associated with said input vector. 

Preferably, the method additionally comprises the step of: presenting to 
said user information indicative of said measure of a degree of 
confidence. 

Advantageously, this allows the user to identify and to concentrate 
interpretation effort on reasons allocated a high degree of confidence. 

Preferably, said information is presented sorted according to said 
measures of confidence.

Advantageously, this allows the user to identify the reason allocated the
highest degree of confidence more readily, thereby speeding the user's 
interpretation of the presented information. 

Preferably, each said reason is presented selectively responsive to a 
comparison between said measure of degree of confidence associated 
with said reason and a threshold criterion.

Advantageously, this allows the amount of information presented to a user 
to be limited, so that the user is not swamped with large numbers of 
potential reasons, some of which may have only been allocated a small
degree of confidence. 

The invention also provides a method of operating a supervisedly 
trainable data classifier, comprising the steps of: associating a reason 
with at least one training vector; training said data classifier using said 
training vector; providing an input vector to said data classifier whereby to 
generate an output vector; associating said reason with said output 
vector responsive to a comparison between said input vector and said at 
least one training vector. 

In a preferred embodiment, said data classifier comprises a neural 
network. 

According to a further aspect of the present invention there is provided a 
data classifier system, comprising: a supervisedly trained data classifier 
arranged to provide an output vector responsive to receipt of an input 
vector; a store containing an association between a reason and a training 
vector previously used to train said classifier; a data processing 
subsystem arranged to associate said reason with an output vector 
received from said data classifier, responsive to a comparison between 
said input vector and said training vector. 

Preferably, the data classifier system additionally comprises: a computer 
display arranged to present an indication of said reason and an indication
of said output vector to a user. 

Preferably, the data classifier system additionally comprises: a data 
processing subsystem arranged to calculate a measure of a degree of 
confidence with which said reason is associated with said input vector.

Preferably, said display is arranged to present an indication of said
degree of confidence to said user. 

According to a further aspect of the present invention there is provided an
anomaly detection system comprising a data classifier system according
to the present invention.

According to a further aspect of the present invention there is provided an 
account fraud detection system comprising a data classifier according
to the present invention.

According to a further aspect of the present invention there is provided a 
telecommunications account fraud detection system comprising a data 
classifier according to the present invention.

The invention also provides for a system for the purposes of digital signal 
processing which comprises one or more instances of apparatus 
embodying the present invention, together with other additional 
apparatus. 

According to a further aspect of the present invention there is provided 
computer software in a machine-readable medium arranged to perform 
the steps of: receiving an input vector; providing an output vector 
indicative of a classification of said input vector; associating a reason with 
said classifier output vector responsive to a comparison between said 
classifier input vector and a previously stored association between a 
training vector used to train said classifier and said reason. 

Preferably, the computer software is additionally arranged to perform the 
steps of: associating with said output vector a measure of a degree of 
confidence with which said reason is associated with said input vector. 

Preferably, the computer software is additionally arranged to perform the 
steps of: associating a reason with at least one training vector for a data 
classifier; training said data classifier using said training vector; providing 
an input vector to said data classifier whereby to generate an output 
vector; associating said reason with said output vector responsive to a 
comparison between said input vector and said at least one training
vector.

The preferred features may be combined as appropriate, as would be
apparent to a skilled person, and may be combined with any of the 
aspects of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

In order to show how the invention may be carried into effect, 
embodiments of the invention are now described below by way of 
example only and with reference to the accompanying figures in which: 

Figure 1 shows a simple example of a classifier input space and
corresponding output classification regions; 

Figure 2 shows an example of a system in accordance with the present 
invention. 

DETAILED DESCRIPTION OF INVENTION 

A simple example of training data for a supervised training data classifier 
is the XOR (logical "exclusive or") problem shown in Table 1. Whilst in
practice problems addressed by data classifiers are typically much more 
complex, the XOR problem is adequate for a clear exposition of the 
invention. 

The XOR classifier has two inputs, Input1 and Input2, and two outputs,
Output1 and Output2. Data values input on Input1 and Input2 should lie
in the approximate range 0 to 1 (and ideally ought to be precisely 0 or 1).
Output1 should be at value 1 (active) if both inputs are the same value;
Output2 should be active if one, but not both, of the inputs is active.



Table 1: XOR training vectors

           Input1   Input2   Output1   Output2
Vector1      0        0         1         0
Vector2      0        1         0         1
Vector3      1        0         0         1
Vector4      1        1         1         0

We define Class1 to be the classification if Output1 is 1 and Output2 is 0;
Class2 to be the classification if Output1 is 0 and Output2 is 1.

Referring now to Figure 1, there is shown a representation of the four 
training input vector values of Table 1 in their two dimensional space. It 
also shows a potential solution to defining the corresponding two 
classificatory regions (Class1 and Class2) for this problem. In
classification mode (i.e. non training mode) the Neural Network will
determine whether a given input vector falls within Class1 or Class2
dependent upon where the input vector lies in the input space. The
Neural Network presents its decision in the form of the values on its
outputs: if Output1 has a value of (or close to) 0 and Output2 has a value
of (or close to) 1 then the input vector will have been classified as being in
Class2; if Output1 has a value of (or close to) 1 and Output2 has a value
of (or close to) 0 then the input vector will have been classified as being in
Class1. Such a Neural Network can be trained to determine which
region/classification group the input vector should be assigned to very 
accurately and efficiently. 
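
The output convention just described may be sketched in Python as follows. This is an illustration only: the function name `decode_xor_output` and the tolerance `tol`, which stands for "close to", are assumptions introduced here and are not specified by the description.

```python
def decode_xor_output(output, tol=0.25):
    """Map a two-element classifier output vector to 'Class1' or 'Class2'.

    `tol` is a hypothetical tolerance for "close to" 0 or 1.
    Returns None when the outputs match neither pattern.
    """
    out1, out2 = output
    # Output1 near 1 and Output2 near 0 indicates Class1.
    if abs(out1 - 1.0) < tol and abs(out2) < tol:
        return "Class1"
    # Output1 near 0 and Output2 near 1 indicates Class2.
    if abs(out1) < tol and abs(out2 - 1.0) < tol:
        return "Class2"
    return None
```

An output such as [0.9, 0.9] matches neither pattern and is reported as undecided rather than forced into a class.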

In the method presented here, the training vectors (each comprising an 
input-output vector pair) are provided to train a supervised training data 
classifier in the conventional way as described above. However these 
vectors are stored for future reference and, with at least one, and 
preferably with most or all of these vectors, there is stored an indication of 
a reason why the input and output vectors have been so associated, as 
illustrated in Table 2. 



Table 2: XOR training vectors with associated reasons

           Input1   Input2   Output1   Output2   Reason
Vector1      0        0         1         0      Both inputs off - Class1
Vector2      0        1         0         1      Only second input on - Class2
Vector3      1        0         0         1      Only first input on - Class2
Vector4      1        1         1         0      Both inputs on - Class1



This association of reasons to training vectors may occur prior to training 
of the classifier or after, and it may be made all at one time, or 
incrementally during operation of the classifier during or after training, 
dependent on when an understanding of the requisite reasons becomes 
known. 
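
One possible way of holding the stored association between training vectors and reasons, including a reason that is added later, is sketched below in Python. The names `TrainingRecord`, `store` and `attach_reason` are hypothetical and are introduced purely for illustration; the data are those of Table 2.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TrainingRecord:
    inputs: Tuple[float, ...]
    outputs: Tuple[float, ...]
    reason: Optional[str] = None  # may be supplied before or after training


# The vectors of Table 2; the third reason is deliberately left unset at first.
store = [
    TrainingRecord((0, 0), (1, 0), "Both inputs off - Class1"),
    TrainingRecord((0, 1), (0, 1), "Only second input on - Class2"),
    TrainingRecord((1, 0), (0, 1)),
    TrainingRecord((1, 1), (1, 0), "Both inputs on - Class1"),
]


def attach_reason(records, inputs, reason):
    """Attach a reason to the stored record whose input part equals `inputs`."""
    for record in records:
        if record.inputs == tuple(inputs):
            record.reason = reason
            return record
    raise KeyError(f"no training record with inputs {inputs!r}")


# A reason becoming known later is associated incrementally.
attach_reason(store, (1, 0), "Only first input on - Class2")

# Only records that carry a reason take part in later comparisons.
reasoned = [r for r in store if r.reason is not None]
```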



Once the classifier has been trained using the training sequence, an input
vector may be presented to it in the usual way, and a corresponding
output vector is generated indicative of the classification of the input 
vector. 

The input vector is then compared to the stored training vectors having 
associated reasons. Each such reason can then be presented to the user 
along with an indication of the degree of confidence that may be placed in 
associating that specific reason with classification of the present input 
vector. 

The preferred indication of such confidence is a measure of the distance 
(or closeness) between (a) the input vector giving rise to the present 
classification output, and (b) the input vector in the training sequence 
associated with each given reason. This measure may be used to inform 
the user of the rationale behind the classification. This can be done (a) by 
displaying the value of the closeness measure somewhere near the 
reason, (b) by ordering the list of reasons presented, or (c) by restricting 
the set of reasons presented according to given threshold criteria (e.g. 
either to a predetermined maximum number of reasons, or to those 
reasons whose degree of confidence measure is less than some given 
value, or other applicable threshold criteria). Alternatively a user may first 
be presented with an option to request display of reasons, so that reasons 
are displayed only if explicitly requested. 
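
The ordering and restriction options (b) and (c) above may be sketched as follows. The function `select_reasons` and its parameters `max_reasons` and `max_distance` are assumptions made for this illustration, as are the rounded distance values used in the sample data.

```python
def select_reasons(scored, max_reasons=None, max_distance=None):
    """Order and optionally restrict (distance, reason) pairs.

    A smaller distance means a closer training vector, and hence a
    higher degree of confidence in the associated reason.
    """
    # (b) order the list of reasons, closest first (the sort is stable).
    ranked = sorted(scored, key=lambda pair: pair[0])
    # (c) keep only reasons whose distance is less than a given value.
    if max_distance is not None:
        ranked = [pair for pair in ranked if pair[0] < max_distance]
    # (c) keep at most a predetermined maximum number of reasons.
    if max_reasons is not None:
        ranked = ranked[:max_reasons]
    return ranked


# Illustrative, rounded distances for the XOR example.
scored = [
    (0.95, "Both inputs off - Class1"),
    (0.07, "Only second input on - Class2"),
    (1.34, "Only first input on - Class2"),
    (0.95, "Both inputs on - Class1"),
]
best_only = select_reasons(scored, max_reasons=1)
within_one = select_reasons(scored, max_distance=1.0)
```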

The provision of a measure of closeness both enables the detection of 
new input values that are not represented in the training data set and 
gives the user an indication of how relevant the reason given is to the 
newly classified input vector.

A most preferred measure for use in determining the closest training
vector to the vector being classified is the Euclidean distance. Given a
vector \(x_i\), where

\[ x_i = [x_{i1}, x_{i2}, \ldots, x_{in}] \tag{1} \]

the Euclidean distance, \(\lVert x_i - x_j \rVert\), between two such vectors, \(x_i\) and \(x_j\), is
defined by Equation (2).

\[ \lVert x_i - x_j \rVert = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2} \tag{2} \]

This is an acceptable distance measure for classifier inputs since the 
input vectors will already have been pre-processed using some form of 
standardisation or normalisation prior to use with the Neural Network. 
However, the method proposed here is not limited to the use of Euclidean 
distance: other distance measures could be used just as effectively 
according to circumstances.
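
By way of illustration, the Euclidean distance may be computed as in the following Python sketch. The function names are hypothetical, and the Manhattan distance is shown only as one example of an alternative measure; it is not named in the description.

```python
import math


def euclidean_distance(x_i, x_j):
    """Square root of the summed squared component differences."""
    if len(x_i) != len(x_j):
        raise ValueError("vectors must have the same number of components")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


def manhattan_distance(x_i, x_j):
    """An alternative measure: the sum of absolute component differences."""
    if len(x_i) != len(x_j):
        raise ValueError("vectors must have the same number of components")
    return sum(abs(a - b) for a, b in zip(x_i, x_j))
```

Either function may be passed to code that ranks training vectors, so that the distance measure can be changed without altering the rest of the method.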

The use of Euclidean distance gives a numerical value representing how 
close the two vectors are to each other. The values returned can be used 
to rank the order of closeness of an input vector to the corresponding 
portions of training vectors and their associated reasons. However the 
values returned cannot as easily be used to inform the user how close the 
vectors are to each other. This is because the absolute range of values 
will be dependent upon the nature of the data set being classified,
especially if the data set has not undergone a standard normalisation. A 
particular absolute value (e.g. 10) returned may be indicative that two 
vectors are very close for some datasets, and quite distant for a different 
data set. The nearness values presented to the user are therefore 
preferably converted into a standardised value (for example as a
percentage of a maximum possible distance value) to be easily 
understandable to the end user. 

To achieve this a maximum Euclidean distance between any two vectors 
in the training set may be determined and this distance used as a 
reference to determine the percentage, or other relative distance, values. 
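
A minimal sketch of this standardisation, assuming Python 3.8 or later for `math.dist`, is given below. The names `max_pairwise_distance` and `as_percentage` are introduced here for illustration only.

```python
import math
from itertools import combinations


def max_pairwise_distance(vectors):
    """Largest Euclidean distance between any two vectors in the set.

    Requires at least two vectors; the pairwise search is quadratic in
    the size of the set, so the result would normally be computed once
    and stored.
    """
    return max(math.dist(a, b) for a, b in combinations(vectors, 2))


def as_percentage(distance, reference):
    """Express a distance relative to the reference maximum distance."""
    return 100.0 * distance / reference


XOR_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
reference = max_pairwise_distance(XOR_INPUTS)
```

An input vector lying outside the region spanned by the training set can yield a value above 100, which is itself an indication that the input is poorly represented in the training data.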

For example, again using the XOR training data set, the maximum 
Euclidean distance between any two input vectors is √2. Given an input
vector of [0.05,0.95] 100 for which to provide a reason, the closest input
training vector is [0,1] with a Euclidean distance between them 110 of
0.07. This converts into a percentage of approximately 5. The second
closest are input training vectors [0,0] and [1,1] with Euclidean distances
120, 130 of 0.95 from [0.05,0.95], giving a percentage of 67; input
training vector [1,0] lies at a Euclidean distance 140 of 1.34 from
[0.05,0.95], giving a percentage of 95.

The results returned from the reason generation method, in response to a 
request for vector [0.05,0.95] would be: 

Ranking 1: "Only second input on - Class2", closeness of 5%

Ranking 2: "Both inputs off - Class1", closeness of 67%

Ranking 3: "Both inputs on - Class1", closeness of 67%

Ranking 4: "Only first input on - Class2", closeness of 95%

The results given clearly show to the user that the input given is close to 
just one training vector and the reason given is highly likely to provide an 
accurate description of the reason for the Neural Network's classification 
of this input vector. 
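
The worked example may be reproduced with the short Python sketch below, which assumes Python 3.8 or later for `math.dist`. The names `TRAINING` and `rank_reasons` are hypothetical. Because the two middle entries are equidistant up to floating-point rounding, their relative order is not guaranteed.

```python
import math

# Input parts of the training vectors with their stored reasons (Table 2).
TRAINING = [
    ((0, 0), "Both inputs off - Class1"),
    ((0, 1), "Only second input on - Class2"),
    ((1, 0), "Only first input on - Class2"),
    ((1, 1), "Both inputs on - Class1"),
]


def rank_reasons(input_vector, training, reference):
    """Return (percentage, reason) pairs, closest training vector first."""
    scored = [
        (100.0 * math.dist(input_vector, vector) / reference, reason)
        for vector, reason in training
    ]
    return sorted(scored, key=lambda item: item[0])


# The maximum distance between any two XOR training inputs is sqrt(2).
ranked = rank_reasons((0.05, 0.95), TRAINING, math.sqrt(2))
for rank, (percentage, reason) in enumerate(ranked, start=1):
    print(f'Ranking {rank}: "{reason}", closeness of {percentage:.0f}%')
```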

Whilst the representations both of reason and of closeness shown here 
are textual in nature, the invention is not limited to textual representation: 
other representations (e.g. graphical or aural indications) could in practice 
also be adopted according to the nature of the data or the context of the 
classification. 

As can be seen even in this simple example, a clearly understandable 
rationale for the classifier's behaviour is presented to a user. A key 
feature of this method is that it remains simple for the user however large, 
multi-dimensional, or complex the data being classified. Another key 
feature is that there is no additional cost on Neural Network performance, 
and other computational cost only occurs when a reason is required. 

In some applications there will only be a need for a reason in a relatively 
small proportion of the vectors being classified.

The selection of which outputs are used to extract a reason from the
training data may be predetermined according to an automatic method, or 
may be on specific request by a user or any combination as appropriate. 

In, for example, an embodiment in which a classifier is used to identify 
anomalies in the stream of input vectors, reasons may be identified and 
associated with output vectors only for some output vectors of interest. 
This avoids unnecessary processing overhead arising from data lookup 
for output vectors for which no reason is required. One example of such 
an embodiment is a system arranged to detect telecommunication (or 
other) account fraud in which input vector data relates to account usage 
data and classification outputs are indicative of specific types of account 
fraud or of normal account usage. 
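
The selective lookup described above may be sketched as follows, continuing the XOR example rather than real account data. Everything named here is an assumption for illustration: `stub_classifier` merely stands in for a trained classifier, and Class2 is arbitrarily treated as the classification "of interest".

```python
import math

TRAINING = [
    ((0, 0), "Both inputs off - Class1"),
    ((0, 1), "Only second input on - Class2"),
    ((1, 0), "Only first input on - Class2"),
    ((1, 1), "Both inputs on - Class1"),
]


def nearest_reason(input_vector, training):
    """Return the reason stored with the closest training input vector."""
    return min(training, key=lambda rec: math.dist(input_vector, rec[0]))[1]


def classify_stream(stream, classify, is_of_interest, training):
    """Yield (input, label, reason); the lookup runs only when needed."""
    for vector in stream:
        label = classify(vector)
        # The training data lookup is skipped for uninteresting outputs.
        reason = nearest_reason(vector, training) if is_of_interest(label) else None
        yield vector, label, reason


def stub_classifier(vector):
    """Hypothetical stand-in for a trained network: rounds inputs, applies XOR."""
    return "Class2" if round(vector[0]) != round(vector[1]) else "Class1"


results = list(
    classify_stream(
        [(0.02, 0.03), (0.05, 0.95)],
        stub_classifier,
        lambda label: label == "Class2",
        TRAINING,
    )
)
```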

Referring now to Figure 2, and continuing to use the XOR example, an 
input vector [0.05, 0.95] is presented 201 to a trained data classifier 200
and generates 202 an output vector [0, 1] indicative of classification into
Class2. 

Both the input 206 and the output 202 of the data classifier are provided to a
processor 210 which uses the classification output to present an
indication 205 of the actual classification to the user, whilst using the classifier
input 206 with reference to a stored training vector 220 to identify possible 
reasons and a measure of their correlation to the given input vector. 

The information can then be further processed 210 to enable its 
communication 205 and presentation 230 to a user. 

The information is processed 210 and communicated 205 to the user 
showing the classification (Class2), and the identified reasons along with 
their respective measures of closeness: 

67% - Both inputs off - Class1

5% - Only second input on - Class2

95% - Only first input on - Class2

67% - Both inputs on - Class1

In order to enable a measure of closeness between the current input vector
and training input vector information to be calculated, the input vectors 
must be provided 206 to the processor 210. 

The method associated with the present invention may be implemented 
as a program for a computer, with steps corresponding to the method 
steps as would be apparent to a person skilled in the art. 

In summary, then, a method, apparatus, and computer software are 
provided whereby to associate a readily user-interpretable reason with an 
output of a supervised training data classifier. Reasons are associated 
with one or more members of a sequence of training vectors and 
subsequently associated, in operation, with a given classifier input by 
comparing the classifier input vector with training sequence vectors. A 
measure of confidence in the selected reasons is derived by comparing 
the classifier input vector with the corresponding training inputs with which 
the selected reasons were associated and calculating a measure of their 
closeness. 

Any range or device value given herein may be extended or altered 
without losing the effect sought, as will be apparent to the skilled person 
for an understanding of the teachings herein.

CLAIMS

1. A method of operating a supervisedly trained data classifier, 

comprising the steps of: 

generating an output vector responsive to provision of an input 
vector; and

associating a reason with said classifier output vector 
responsive to a comparison between said classifier input vector and a 
previously stored association between a training vector used to train said 
classifier and said reason. 

2. A method according to claim 1 additionally comprising the step
of:

presenting to a user information indicative of said output vector, 
of said reason, and of their association. 

3. A method according to any one of claims 1-2 additionally 
comprising the step of:

associating with said reason a measure of a degree of
confidence with which said reason is associated with said input vector.

4. A method according to claim 3 additionally comprising the step 
of: 

presenting to said user information indicative of said measure of
a degree of confidence.

5. A method according to any one of claims 3-4 wherein said 
degree of confidence is calculated responsive to a comparison between 
said training vector and said input vector. 

6. A method according to any of claims 3-5 wherein said degree of
confidence is calculated as a distance between said input vector and an
input vector component of said training vector.

7. A method according to claim 6 wherein said distance is a
Euclidean distance.

8. A method according to claim 1 wherein a plurality of reasons
are associated with said classifier output vector responsive to 
comparisons between said classifier input vector and a plurality of 
previously stored associations between training vectors used to train said 
classifier and said reasons. 

9. A method according to claim 8 additionally comprising the step 
of: 

associating with each said reason a measure of a degree of 
confidence with which said reason is associated with said input vector. 

10. A method according to any one of claims 8-9 additionally 
comprising the step of: 

presenting to said user information indicative of said measure of 
a degree of confidence. 

11. A method according to claim 10 wherein said information is 
presented sorted according to said measures of confidence. 

12. A method according to any one of claims 10-11 wherein each 
said reason is presented selectively responsive to at least one of a 
comparison between said measure of degree of confidence associated 
with said reason and a threshold criteria. 

13. A method of operating a supervisedly trainable data classifier, 
comprising the steps of: 

associating a reason with at least one training vector; 

training said data classifier using said training vector; 

providing an input vector to said data classifier whereby to 
generate an output vector; 

associating said reason with said output vector responsive to a 
comparison between said input vector and said at least one training 
vector. 

14. A method according to any one of claims 1-13 wherein said 
data classifier comprises a neural network.

15. A data classifier system, comprising:

a supervisedly trained data classifier arranged to provide an
output vector responsive to receipt of an input vector; 

a store containing an association between a reason and a 
training vector previously used to train said classifier; 

a data processing subsystem arranged to associate said reason 
with an output vector received from said data classifier, responsive to a 
comparison between said input vector and said training vector. 

16. A data classifier system according to claim 15 additionally 
comprising: 

a data processing subsystem arranged to calculate a measure 
of a degree of confidence with which said reason is associated with said 
input vector. 

17. A data classifier system according to any one of claims 15-16 
additionally comprising: 

a computer display arranged to present an indication of said 
reason and an indication of said output vector to a user. 

18. A data classifier system according to claim 17 wherein said 
display is arranged to present an indication of said degree of confidence 
to said user. 

19. An anomaly detection system comprising a data classifier
system according to any one of claims 15-18. 

20. An account fraud detection system comprising a data classifier 
according to any one of claims 15-18. 

21. A telecommunications account fraud detection system 
comprising a data classifier according to any one of claims 15-18. 

22. Computer software in a machine-readable medium arranged to 
perform the steps of: 

receiving a data classifier output vector;

providing an output vector indicative of a classification of said
input vector; 

associating a reason with said classifier output vector 
responsive to a comparison between said classifier input vector and a 
previously stored association between a training vector used to train said 
classifier and said reason. 

23. Computer software in a machine-readable medium according to 
claim 22 additionally arranged to perform the steps of: 

associating with said output vector a measure of a degree of 
confidence with which said reason is associated with said input vector. 

24. Computer software in a machine-readable medium arranged to 
perform the steps of: 

associating a reason with at least one training vector for a data
classifier;

training said data classifier using said training vector; 

providing an input vector to said data classifier whereby to 
generate an output vector; 

associating said reason with said output vector responsive to a 
comparison between said input vector and said at least one training 
vector. 